169 research outputs found

Generating and Sampling Orbits for Lifted Probabilistic Inference

Author: Broeck Guy Van den
Holtzen Steven
Millstein Todd
Publication date: 01/01/2019

A key goal in the design of probabilistic inference algorithms is identifying and exploiting properties of the distribution that make inference tractable. Lifted inference algorithms identify symmetry as a property that enables efficient inference and seek to scale with the degree of symmetry of a probability model. A limitation of existing exact lifted inference techniques is that they do not apply to non-relational representations like factor graphs. In this work we provide the first example of an exact lifted inference algorithm for arbitrary discrete factor graphs. In addition we describe a lifted Markov-Chain Monte-Carlo algorithm that provably mixes rapidly in the degree of symmetry of the distribution.

arXiv.org e-Print Archive

eScholarship - University of California

Probabilistic Program Abstractions

Author: Broeck Guy Van den
Holtzen Steven
Millstein Todd
Publication date: 01/01/2017

Abstraction is a fundamental tool for reasoning about complex systems. Program abstraction has been utilized to great effect for analyzing deterministic programs. At the heart of program abstraction is the relationship between a concrete program, which is difficult to analyze, and an abstract program, which is more tractable. Program abstractions, however, are typically not probabilistic. We generalize non-deterministic program abstractions to probabilistic program abstractions by explicitly quantifying the non-deterministic choices. Our framework upgrades key definitions and properties of abstractions to the probabilistic context. We also discuss preliminary ideas for performing inference on probabilistic abstractions and general probabilistic programs.

arXiv.org e-Print Archive

eScholarship - University of California

Symbolic Exact Inference for Discrete Probabilistic Programs

Author: Broeck Guy Van den
Holtzen Steven
Millstein Todd
Publication date: 01/01/2019

The computational burden of probabilistic inference remains a hurdle for applying probabilistic programming languages to practical problems of interest. In this work, we provide a semantic and algorithmic foundation for efficient exact inference on discrete-valued finite-domain imperative probabilistic programs. We leverage and generalize efficient inference procedures for Bayesian networks, which exploit the structure of the network to decompose the inference task, thereby avoiding full path enumeration. To do this, we first compile probabilistic programs to a symbolic representation. Then we adapt techniques from the probabilistic logic programming and artificial intelligence communities in order to perform inference on the symbolic representation. We formalize our approach, prove it sound, and experimentally validate it against existing exact and approximate inference techniques. We show that our inference approach is competitive with inference procedures specialized for Bayesian networks, thereby expanding the class of probabilistic programs that can be practically analyzed.

arXiv.org e-Print Archive

eScholarship - University of California

Overfitting in Synthesis: Theory and Practice (Extended Version)

Author: Millstein Todd
Nori Aditya
Padhi Saswat
Sharma Rahul
Publication date: 01/01/2019

In syntax-guided synthesis (SyGuS), a synthesizer's goal is to automatically generate a program belonging to a grammar of possible implementations that meets a logical specification. We investigate a common limitation across state-of-the-art SyGuS tools that perform counterexample-guided inductive synthesis (CEGIS). We empirically observe that as the expressiveness of the provided grammar increases, the performance of these tools degrades significantly. We claim that this degradation is not only due to a larger search space, but also due to overfitting. We formally define this phenomenon and prove no-free-lunch theorems for SyGuS, which reveal a fundamental tradeoff between synthesizer performance and grammar expressiveness. A standard approach to mitigate overfitting in machine learning is to run multiple learners with varying expressiveness in parallel. We demonstrate that this insight can immediately benefit existing SyGuS tools. We also propose a novel single-threaded technique called hybrid enumeration that interleaves different grammars and outperforms the winner of the 2018 SyGuS competition (Inv track), solving more problems and achieving a $5\times$ mean speedup. Comment: 24 pages (5 pages of appendices), 7 figures, includes proofs of theorem

arXiv.org e-Print Archive

eScholarship - University of California

Don't mind the gap: Bridging network-wide objectives and device-level configurations

Author: Beckett Ryan
Mahajan Ratul
Millstein Todd
Padhye Jitendra
Walker David
Publication venue: eScholarship, University of California
Publication date: 08/11/2019

We reflect on the historical context that led to Propane, a high-level language and compiler to help network operators bridge the gap between network-wide routing objectives and low-level configurations of devices that run complex, distributed protocols. We also highlight the primary contributions that Propane made to the networking literature and describe ongoing challenges. We conclude with an important lesson learned from the experience.

eScholarship - University of California

FlashProfile: A Framework for Synthesizing Data Profiles

Author: Gulwani Sumit
Jain Prateek
Millstein Todd
Padhi Saswat
Perelman Daniel
Polozov Oleksandr
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 24/10/2018

We address the problem of learning a syntactic profile for a collection of strings, i.e. a set of regex-like patterns that succinctly describe the syntactic variations in the strings. Real-world datasets, typically curated from multiple sources, often contain data in various syntactic formats. Thus, any data processing task is preceded by the critical step of data format identification. However, manual inspection of data to identify the different formats is infeasible in standard big-data scenarios. Prior techniques are restricted to a small set of pre-defined patterns (e.g. digits, letters, words, etc.), and provide no control over granularity of profiles. We define syntactic profiling as a problem of clustering strings based on syntactic similarity, followed by identifying patterns that succinctly describe each cluster. We present a technique for synthesizing such profiles over a given language of patterns, that also allows for interactive refinement by requesting a desired number of clusters. Using a state-of-the-art inductive synthesis framework, PROSE, we have implemented our technique as FlashProfile. Across 153 tasks over 75 large real datasets, we observe a median profiling time of only $\sim 0.7$ s. Furthermore, we show that access to syntactic profiles may allow for more accurate synthesis of programs, i.e. using fewer examples, in programming-by-example (PBE) workflows such as FlashFill. Comment: 28 pages, SPLASH (OOPSLA) 201

arXiv.org e-Print Archive

eScholarship - University of California

The Silently Shifting Semicolon

Author: Marino Daniel
Millstein Todd
Musuvathi Madanlal
Narayanasamy Satish
Singh Abhayendra
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 1st Summit on Advances in Programming Languages (SNAPL 2015)
Publication date: 01/01/2015

Memory consistency models for modern concurrent languages have largely been designed from a system-centric point of view that protects, at all costs, optimizations that were originally designed for sequential programs. The result is a situation that, when viewed from a programmer's standpoint, borders on absurd. We illustrate this unfortunate situation with a brief fable and then examine the opportunities to right our path.

Dagstuhl Research Online Publication Server

What do LLMs need to Synthesize Correct Router Configurations?

Author: Beckett Ryan
Millstein Todd
Mondal Rajdeep
Tang Alan
Varghese George
Publication date: 10/07/2023

We investigate whether Large Language Models (e.g., GPT-4) can synthesize correct router configurations with reduced manual effort. We find GPT-4 works very badly by itself, producing promising draft configurations but with egregious errors in topology, syntax, and semantics. Our strategy, which we call Verified Prompt Programming, is to combine GPT-4 with verifiers and use localized feedback from the verifier to automatically correct errors. Verification requires a specification and actionable localized feedback to be effective. We show results for two use cases: translating from Cisco to Juniper configurations on a single router, and implementing no-transit policy on multiple routers. While human input is still required, if we define the leverage as the ratio of the number of automated prompts to the number of human prompts, our experiments show a leverage of 10X for Juniper translation, and 6X for implementing no-transit policy, ending with verified configurations.

arXiv.org e-Print Archive